Simulating physical network paths (e.g., the Internet) is a cornerstone research problem in the emerging sub-field of AI-for-networking. We seek a model that generates end-to-end packet delay values in response to the time-varying load offered by a sender, which is typically a function of the previously output delays. The problem setting is unique, and renders the state-of-the-art text and time-series generative models inapplicable or ineffective. We formulate an ML problem at the intersection of dynamical systems, sequential decision making, and time-series modeling. We propose a novel grey-box approach to network simulation that embeds the semantics of a physical network path in a new RNN-style model called RBU, providing the interpretability of standard network simulator tools, the power of neural models, and the efficiency of SGD-based techniques for learning, and yielding promising results on synthetic and real-world network traces.
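Graph Neural Networks (GNNs) exploit signals from node features and the input graph topology to improve node classification performance. However, these models tend to perform poorly on heterophilic graphs, where connected nodes have different labels. Recently proposed GNNs work across graphs with varying levels of homophily. Among these, models relying on polynomial graph filters have shown promise. We observe that solutions to these polynomial graph filter models are also solutions to an overdetermined system of equations. This suggests that in some instances the model needs to learn a reasonably high-order polynomial. On investigation, we find that the proposed models are ineffective at learning such polynomials due to their design. To mitigate this issue, we perform an eigendecomposition of the graph and propose to learn multiple adaptive polynomial filters acting on different subsets of the spectrum. We demonstrate theoretically and empirically that our proposed model learns a better filter, thereby improving classification accuracy. We study various aspects of our proposed model, including its dependency on the number of eigencomponents used, the latent polynomial filters learned, and the performance of the individual polynomials on the node classification task. We further show that our model is scalable by evaluating it on large graphs. Our model achieves performance gains of up to 5% over state-of-the-art models and generally outperforms existing polynomial filter-based approaches.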
Building segmentation in high-resolution InSAR images is a challenging task that can be useful for large-scale surveillance. Although complex-valued deep learning networks perform better than their real-valued counterparts for complex-valued SAR data, phase information is not retained throughout the network, which causes a loss of information. This paper proposes a Fully Complex-valued, Fully Convolutional Multi-feature Fusion Network (FC2MFN) for building semantic segmentation on InSAR images using a novel, fully complex-valued learning scheme. The network learns multi-scale features, performs multi-feature fusion, and has a complex-valued output. To suit the particularities of complex-valued InSAR data, a new complex-valued pooling layer is proposed that compares complex numbers considering both their magnitude and phase. This helps the network retain the phase information even through the pooling layer. Experimental results on the simulated InSAR dataset show that FC2MFN achieves better results than other state-of-the-art methods in terms of segmentation performance and model complexity.
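The ability to identify influential training examples enables us to debug training data and explain model behavior. Existing techniques for doing so are based on the flow of training data influence through the model parameters. For large models in NLP applications, it is often computationally infeasible to study this flow through all model parameters, so techniques usually pick the last layer of weights. However, we observe that because the activations connected to the last layer of weights contain "shared logic", the data influence calculated via the last-layer weights is prone to a "cancellation effect", in which the influences of different examples have large magnitudes that contradict each other. The cancellation effect lowers the discriminative power of the influence score, and deleting influential examples according to this measure often does not change the model's behavior by much. To mitigate this, we propose a technique called TracIn-WE that modifies a method called TracIn to operate on the word embedding layer instead of the last layer, where the cancellation effect is less severe. One potential concern is that influence based on the word embedding layer may not encode sufficient high-level information. However, we find that gradients (unlike embeddings) do not suffer from this, possibly because they chain through the higher layers. We show that TracIn-WE significantly outperforms other data influence methods applied on the last layer, by 4-10×, on the case deletion evaluation on three language classification tasks. In addition, TracIn-WE can produce scores not only at the level of the overall training input but also at the level of words within the training input, a further aid to debugging.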
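Many recent advances in speech separation are primarily aimed at synthetic mixtures of short audio utterances with high degrees of overlap. These datasets differ significantly from real conversational data, and hence models trained and evaluated on them do not generalize to real conversational scenarios. Another issue with using most of these models for long-form speech is the nondeterministic ordering of separated speech segments, due to either unsupervised clustering for time-frequency masks or the Permutation Invariant Training (PIT) loss. This makes it difficult to accurately stitch homogeneous speaker segments for downstream tasks such as Automatic Speech Recognition (ASR). In this paper, we propose a speaker-conditioned separator trained on speaker embeddings extracted directly from the mixed signal. We train this model using a directed loss that regulates the order of the separated segments. With this model, we achieve significant improvements in Word Error Rate (WER) on real conversational data without the need for an additional re-stitching step.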
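This paper presents a coupled, neural network-aided longitudinal cruise and lateral path-tracking controller for an autonomous vehicle with model uncertainties that experiences unknown external disturbances. Using a feedback error learning mechanism, an inverse vehicle dynamics learning scheme is employed that utilizes an adaptive Radial Basis Function (RBF) neural network, referred to as the Extended Minimal Resource Allocating Network (EMRAN). EMRAN uses an extended Kalman filter for online learning and weight updates, and incorporates a growing/pruning strategy to maintain a compact network that is easier to implement in real time. The online learning algorithm handles parametric uncertainties and eliminates the effect of unknown disturbances on the road. Combined with a self-regulating learning scheme for improving generalization performance, the proposed EMRAN-aided control architecture assists a basic PID cruise controller and a Stanley path-tracking controller in a coupled form. Its performance and robustness to various disturbances and uncertainties are compared with those of the conventional PID and Stanley controllers, as well as with a fuzzy-based PID controller and an active disturbance rejection control (ADRC) scheme. Simulation results are presented for both slow- and high-speed scenarios. The root-mean-square (RMS) and maximum tracking errors clearly indicate the effectiveness of the proposed control scheme in achieving better tracking performance for autonomous vehicles in unknown environments.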
We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy. We show that they are not satisfied by most known attribution methods, which we consider to be a fundamental weakness of those methods. We use the axioms to guide the design of a new attribution method called Integrated Gradients. Our method requires no modification to the original network and is extremely simple to implement; it just needs a few calls to the standard gradient operator. We apply this method to a couple of image models, a couple of text models and a chemistry model, demonstrating its ability to debug networks, to extract rules from a network, and to enable users to engage with models better.
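To illustrate the claim that the method needs only a few gradient calls, here is a minimal sketch of Integrated Gradients as it is commonly formulated: each feature's attribution is its difference from a baseline input, multiplied by the gradient averaged along the straight line from the baseline to the input. The toy sigmoid model, its weights, the all-zeros baseline, the midpoint Riemann sum, and every function name below are assumptions made for illustration and are not taken from the abstract.

```python
import math

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn(point) must return the gradient of the model output with
    respect to each input feature at `point` (hypothetical interface).
    """
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    # Scale the path-averaged gradient by the input-minus-baseline difference.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

# Toy model (assumption): F(x) = sigmoid(w . x), with a hand-written gradient.
WEIGHTS = [2.0, -1.0, 0.5]

def model(x):
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))

def model_grad(x):
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]  # all-zeros baseline, chosen for illustration
attributions = integrated_gradients(model_grad, x, baseline)
output_change = model(x) - model(baseline)
print(attributions, sum(attributions), output_change)
```

In this sketch the attributions should sum approximately to the change in model output between the baseline and the input, which gives a quick sanity check on the number of integration steps. With a real network, `model_grad` would be replaced by a call to the framework's gradient operator.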